In this exercise, we are going to study the coverage of coral and algae measured in percentage in the Great Barrier Reef (Australia) at different times and locations. To do that, we are given two data sets, coral1 and coral2, that contain information on the reef identifiers (REEF_NAME and REEF_ID), the reef locations (SECTOR, SHELF, LATITUDE and LONGITUDE), and the Coverage (in percentage) of Algae and Hard Coral (stored in the variable Groups) for each of the locations and times. You can find the data sets for this exercise in the folder called Data.
In addition to the marks displayed in each question, an additional 15 points have been allocated for assessment of general coding style and overall performance.
Please ensure that the report knits properly and all the R code is visible in the final knitted report.
This is an individual assignment.
coral1 <- read.csv(here::here("data/coral1.csv"))
head(coral1,2)%>%
kable() %>%
kable_styling(bootstrap_options = c("basic","striped","hover"))
| SECTOR | SHELF | REEF_NAME | REEF_ID | SITE_NO | LATITUDE | LONGITUDE | SAMPLE_DATE |
|---|---|---|---|---|---|---|---|
| CA | I | LOW ISLANDS REEF | 16028S | 1 | -16.38352 | 145.5710 | 1993-06-12 |
| CA | I | LOW ISLANDS REEF | 16028S | 2 | -16.38648 | 145.5726 | 1993-06-12 |
names(coral1)
## [1] "SECTOR" "SHELF" "REEF_NAME" "REEF_ID" "SITE_NO"
## [6] "LATITUDE" "LONGITUDE" "SAMPLE_DATE"
coral2_0 <- read.csv(here::here("data/coral2.csv"))
head(coral2_0,2) %>%
kable() %>%
kable_styling(bootstrap_options = c("basic","striped","hover"))
| REEF_ID | SITE_NO | Algae | Hard.Coral |
|---|---|---|---|
| 16028S | 1 | 30.64 | 24.77 |
| 16028S | 2 | 35.83 | 29.91 |
coral2 <- coral2_0 %>%
pivot_longer(cols = 3:4,
names_to = "Groups",
values_to = "Coverage")
head(coral2,2) %>%
kable() %>%
kable_styling(bootstrap_options = c("basic","striped","hover"))
| REEF_ID | SITE_NO | Groups | Coverage |
|---|---|---|---|
| 16028S | 1 | Algae | 30.64 |
| 16028S | 1 | Hard.Coral | 24.77 |
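The reshape above selects the measurement columns by position (`cols = 3:4`). As a minimal sketch, the same columns can be selected by name, which should keep working if the column order changes. The data frame `toy_wide` is hypothetical and only mimics the columns of `coral2_0` shown above.

```r
library(tidyr)

# Hypothetical toy data that mimics the layout of coral2_0
toy_wide <- data.frame(
  REEF_ID = c("16028S", "16028S"),
  SITE_NO = c(1, 2),
  Algae = c(30.64, 35.83),
  Hard.Coral = c(24.77, 29.91)
)

# Select the measurement columns by name instead of by position
toy_long <- pivot_longer(
  toy_wide,
  cols = c(Algae, Hard.Coral),
  names_to = "Groups",
  values_to = "Coverage"
)

toy_long
```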
coral <- left_join(coral2,coral1,by = c("REEF_ID","SITE_NO"))
head(coral,2) %>%
kable() %>%
kable_styling(bootstrap_options = c("basic","striped","hover"))
| REEF_ID | SITE_NO | Groups | Coverage | SECTOR | SHELF | REEF_NAME | LATITUDE | LONGITUDE | SAMPLE_DATE |
|---|---|---|---|---|---|---|---|---|---|
| 16028S | 1 | Algae | 30.64 | CA | I | LOW ISLANDS REEF | -16.38352 | 145.571 | 1993-06-12 |
| 16028S | 1 | Algae | 30.64 | CA | I | LOW ISLANDS REEF | -16.38352 | 145.571 | 1995-04-15 |
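The two rows above carry the same coverage value with different sample dates, which suggests that `REEF_ID` and `SITE_NO` together may not identify a single row of `coral1`. The sketch below shows one way to check this before joining; the tables `toy_sites` and `toy_cover` are hypothetical and assume one site surveyed on two dates.

```r
library(dplyr)

# Hypothetical toy tables: one site surveyed on two dates
toy_sites <- data.frame(
  REEF_ID = c("16028S", "16028S"),
  SITE_NO = c(1, 1),
  SAMPLE_DATE = c("1993-06-12", "1995-04-15")
)
toy_cover <- data.frame(
  REEF_ID = "16028S",
  SITE_NO = 1,
  Groups = "Algae",
  Coverage = 30.64
)

# Count how many rows share each join key in the right-hand table
key_counts <- toy_sites %>%
  count(REEF_ID, SITE_NO, name = "n_rows")

# Each coverage row is repeated once per matching site row
toy_joined <- left_join(toy_cover, toy_sites, by = c("REEF_ID", "SITE_NO"))

key_counts
toy_joined
```

If `n_rows` is greater than 1 for any key, every matching coverage value is duplicated in the joined table, and averages computed later are taken over those repeated values.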
coral <- coral %>%
mutate(Year = year(SAMPLE_DATE),
Month = month(SAMPLE_DATE))
The years present in the data frame coral are 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017 and 2018.
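A minimal sketch of the date handling used here, on two hypothetical date strings in the same format as `SAMPLE_DATE`. It parses the strings explicitly with `ymd()` before extracting the year and month, and then lists the distinct years in order.

```r
library(lubridate)

# Hypothetical date strings in the same format as SAMPLE_DATE
toy_dates <- c("1993-06-12", "1995-04-15")

# Parse to Date first, then extract the components
parsed <- ymd(toy_dates)
toy_years <- year(parsed)
toy_months <- month(parsed)

# Distinct years, sorted
sort(unique(toy_years))
```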
coral %>%
group_by(Groups,SECTOR,Year) %>%
summarise(Avg.coverage = mean(Coverage)) %>%
head(2) %>%
kable() %>%
kable_styling(bootstrap_options = c("basic","striped","hover"))
| Groups | SECTOR | Year | Avg.coverage |
|---|---|---|---|
| Algae | CA | 1993 | 54.67876 |
| Algae | CA | 1994 | 54.62692 |
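After `summarise()` on several grouping variables, the result stays grouped by all but the last of them. The sketch below computes the same kind of grouped average on a hypothetical data frame `toy_coral` and uses `.groups = "drop"` to return an ungrouped result.

```r
library(dplyr)

# Hypothetical toy data with the same column names as coral
toy_coral <- data.frame(
  Groups = c("Algae", "Algae", "Algae", "Hard.Coral"),
  SECTOR = c("CA", "CA", "CA", "CA"),
  Year = c(1993, 1993, 1994, 1993),
  Coverage = c(50, 60, 40, 20)
)

# Average coverage per group, sector and year, returned ungrouped
toy_summary <- toy_coral %>%
  group_by(Groups, SECTOR, Year) %>%
  summarise(Avg.coverage = mean(Coverage), .groups = "drop")

toy_summary
```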
Q5graph <- coral %>%
group_by(Groups,SECTOR,Year) %>%
summarise(Avg.coverage = mean(Coverage)) %>%
filter(Groups == "Algae") %>%
ggplot(aes(Year,
Avg.coverage,
color = SECTOR)) +
geom_line() +
geom_point() +
facet_wrap(~SECTOR)
ggplotly(Q5graph)
cairns_sector <- coral %>%
filter(SECTOR == "CA")
head(cairns_sector,2) %>%
kable() %>%
kable_styling(bootstrap_options = c("basic","striped","hover"))
| REEF_ID | SITE_NO | Groups | Coverage | SECTOR | SHELF | REEF_NAME | LATITUDE | LONGITUDE | SAMPLE_DATE | Year | Month |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 16028S | 1 | Algae | 30.64 | CA | I | LOW ISLANDS REEF | -16.38352 | 145.571 | 1993-06-12 | 1993 | 6 |
| 16028S | 1 | Algae | 30.64 | CA | I | LOW ISLANDS REEF | -16.38352 | 145.571 | 1995-04-15 | 1995 | 4 |
The data frame cairns_sector has 27014 rows and 12 columns.
There are 11 unique reefs in cairns_sector.
cairns_sector %>%
group_by(Groups,REEF_NAME,Year) %>%
summarise(Avg.coverage = mean(Coverage)) %>%
head(2) %>%
kable() %>%
kable_styling(bootstrap_options = c("basic","striped","hover"))
| Groups | REEF_NAME | Year | Avg.coverage |
|---|---|---|---|
| Algae | AGINCOURT REEFS (NO 1) | 1994 | 48.94733 |
| Algae | AGINCOURT REEFS (NO 1) | 1995 | 48.94733 |
Q7graph <- cairns_sector %>%
group_by(Groups,REEF_NAME,Year) %>%
summarise(Avg.coverage = mean(Coverage)) %>%
ggplot(aes(Year,
Avg.coverage,
group = Groups,
color = Groups)) +
geom_line() +
facet_wrap(~REEF_NAME)
ggplotly(Q7graph)
Average coverage of each group in the Cairns sector over the years, by reef
The Figure above showcases that:
cairns_sector %>%
select(REEF_NAME,Groups,Coverage,Year) %>%
filter(Groups == "Hard.Coral" & Year == 2017) %>%
group_by(REEF_NAME) %>%
summarise(Avg.coverage = mean(Coverage)) %>%
slice_max(Avg.coverage) %>%
kable() %>%
kable_styling(bootstrap_options = c("basic","striped","hover"))
| REEF_NAME | Avg.coverage |
|---|---|
| AGINCOURT REEFS (NO 1) | 26.87573 |
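`slice_max()` returns one row here only because no other reef shares the maximum; by default it keeps ties. A minimal sketch on a hypothetical data frame `toy_reefs` (reef names and values are made up) shows both behaviours.

```r
library(dplyr)

# Hypothetical averages with a tie for the maximum
toy_reefs <- data.frame(
  REEF_NAME = c("REEF A", "REEF B", "REEF C"),
  Avg.coverage = c(26.9, 26.9, 20.1)
)

# Default: all rows tied for the maximum are returned
top_with_ties <- slice_max(toy_reefs, Avg.coverage, n = 1)

# with_ties = FALSE returns exactly n rows
top_single <- slice_max(toy_reefs, Avg.coverage, n = 1, with_ties = FALSE)

top_with_ties
top_single
```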
cairns_sector %>%
filter(Year == 1993 | Year == 2017) %>%
group_by(Groups,SHELF,Year) %>%
summarise(Avg_coverage = mean(Coverage)) %>%
arrange(Year,desc(Avg_coverage)) %>%
kable() %>%
kable_styling(bootstrap_options = c("basic","striped","hover"))
| Groups | SHELF | Year | Avg_coverage |
|---|---|---|---|
| Algae | I | 1993 | 61.21745 |
| Algae | M | 1993 | 55.87581 |
| Algae | O | 1993 | 42.12000 |
| Hard.Coral | M | 1993 | 23.67801 |
| Hard.Coral | O | 1993 | 22.44467 |
| Hard.Coral | I | 1993 | 15.83418 |
| Algae | M | 2017 | 59.31029 |
| Algae | O | 2017 | 45.53367 |
| Hard.Coral | O | 2017 | 24.66020 |
| Hard.Coral | M | 2017 | 22.11563 |
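To compare the two years directly, the summary can be spread so that each year becomes a column. The sketch below uses the mid-shelf (M) values copied from the table above; the object names `shelf_summary` and `shelf_change` are illustrative.

```r
library(dplyr)
library(tidyr)

# Mid-shelf values copied from the table above
shelf_summary <- data.frame(
  Groups = c("Algae", "Algae", "Hard.Coral", "Hard.Coral"),
  SHELF = c("M", "M", "M", "M"),
  Year = c(1993, 2017, 1993, 2017),
  Avg_coverage = c(55.87581, 59.31029, 23.67801, 22.11563)
)

# One column per year, then the change from 1993 to 2017
shelf_change <- shelf_summary %>%
  pivot_wider(names_from = Year,
              values_from = Avg_coverage,
              names_prefix = "Y") %>%
  mutate(Change = Y2017 - Y1993)

shelf_change
```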
cairns_sector %>%
filter(Year == 2000 & Groups == "Algae") %>%
group_by(REEF_NAME) %>%
summarise(No_of_Observation = n()) %>%
arrange(-No_of_Observation) %>%
kable() %>%
kable_styling(bootstrap_options = c("basic","striped","hover"))
| REEF_NAME | No_of_Observation |
|---|---|
| HASTINGS REEF | 78 |
| AGINCOURT REEFS (NO 1) | 75 |
| ST CRISPIN REEF | 75 |
| THETFORD REEF | 72 |
| GREEN ISLAND REEF | 57 |
| MICHAELMAS REEF | 57 |
| FITZROY ISLAND REEF | 54 |
| LOW ISLANDS REEF | 53 |
| MACKAY REEF | 51 |
| OPAL (2) | 51 |
reef_top <- cairns_sector %>%
filter(Year == 2000) %>%
group_by(REEF_NAME) %>%
summarise(No_of_Observation = n()) %>%
slice_max(No_of_Observation)
The reef with the highest number of observations in the year 2000 is HASTINGS REEF.
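A minimal sketch of how a single reef name can be extracted from such a summary for reporting in the text, using `pull()`. The data frame `toy_obs` is hypothetical; only the column names follow the code above.

```r
library(dplyr)

# Hypothetical observations, one row per observation
toy_obs <- data.frame(
  REEF_NAME = c("HASTINGS REEF", "THETFORD REEF", "HASTINGS REEF"),
  Year = c(2000, 2000, 2000)
)

# Count observations per reef and keep the reef with the most
toy_top <- toy_obs %>%
  count(REEF_NAME, name = "No_of_Observation") %>%
  slice_max(No_of_Observation, n = 1)

# Extract the name as a plain character vector
top_name <- pull(toy_top, REEF_NAME)
top_name
```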
cairns_wider <- coral2_0
cairns_wider %>%
ggplot(aes(Algae,
Hard.Coral)) +
geom_point() +
ggtitle("Relationship between Algae and Hard Coral Coverage in the Cairns Sector")
tidy(lm(Hard.Coral ~ Algae,data = cairns_wider)) %>%
kable() %>%
kable_styling(bootstrap_options = c("basic","striped","hover"))
| term | estimate | std.error | statistic | p.value |
|---|---|---|---|---|
| (Intercept) | 58.2209691 | 0.5584786 | 104.2492 | 0 |
| Algae | -0.5751255 | 0.0096048 | -59.8787 | 0 |
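The table above shows a negative relationship between the variables Algae and Hard.Coral: the estimated slope is about -0.58, so each additional percentage point of algae coverage is associated with roughly 0.58 percentage points less hard coral coverage. This value is a regression slope, not a correlation coefficient, so on its own it does not measure the strength of the relationship. The large test statistic (about -59.9) and a p-value that rounds to 0 indicate that the slope is clearly different from zero.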
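The estimate reported by `lm()` is a slope, which depends on the scale of both variables, whereas the correlation coefficient always lies between -1 and 1. The sketch below uses hypothetical toy values (not the reef data) to show how the two relate: the slope equals the correlation multiplied by `sd(y) / sd(x)`, and the squared correlation equals the model's R-squared.

```r
# Hypothetical toy values, not taken from the reef data
toy <- data.frame(
  Algae = c(30, 40, 50, 60, 70),
  Hard.Coral = c(42, 33, 30, 21, 18)
)

fit <- lm(Hard.Coral ~ Algae, data = toy)

# Regression slope for Algae
slope <- unname(coef(fit)["Algae"])

# Pearson correlation between the two variables
r <- cor(toy$Algae, toy$Hard.Coral)

# R-squared of the simple linear regression
r_squared <- summary(fit)$r.squared

# Slope recovered from the correlation and the two standard deviations
slope_from_r <- r * sd(toy$Hard.Coral) / sd(toy$Algae)

c(slope = slope, correlation = r, r_squared = r_squared)
```

In this toy example the slope is -0.6 while the correlation is close to -1, so a slope of a given size can go with a much stronger or weaker correlation.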